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Abstract. Non-gaussianity represents the statistical sig- 
nature of physical processes such as turbulence. It can 
also be used as a powerful tool to discriminate between 
competing cosmological scenarios. A canonical analysis of 
non-gaussianity is based on the study of the distribution 
of the signal in the real (or direct) space (e.g. brightness, 
temperature). 

This work presents an image processing method in 
which we propose statistical tests to indicate and quantify 
the non-gaussian nature of a signal. Our method is based 
on a wavelet analysis of a signal. Because the temperature 
or brightness distribution is a rather weak discriminator, 
the search for the statistical signature of non-gaussianity 
relies on the study of the coefficient distribution of an 
image in the wavelet decomposition basis which is much 
more sensitive. 

We develop two statistical tests for non-gaussianity. 
In order to test their reliability, we apply them to sets of 
test maps representing a combination of gaussian and non- 
gaussian signals. We deliberately choose a signal with a 
weak non-gaussian signature and we find that such a non- 
gaussian signature is easily detected using our statistical 
discriminators. In a second paper, we apply the tests in a 
cosmological context. 

Key words: Methods: data analysis, statistical, Tech- 
niques: image processing 



1. Introduction 

Non-gaussianity is a very promising way of characterising 
some important physical processes and has many applica- 
tions in astrophysics. In fluid mechanics, the non-gaussian 
signature of the probability density functions of veloc- 
ity is used as evidence for turbulence (e.g. Chabaud et 
al. (1994)). Astrophysical fluids with very large Reynolds 
numbers, such as the interstellar medium, are expected to 
be turbulent. If it exists, turbulence should play a lead- 
ing role in the triggering of star formation, in determining 
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the dynamical structure of the interstellar medium and its 
chemical evolution. Non-gaussianity is a tool for indicat- 
ing turbulence. Indeed, recent studies on the non-gaussian 
shape of the molecular line profiles can be interpreted 
as evidence for turbulence (Falgarone & Phillips 1990; 
Falgarone et al. 1994; Falgarone & Puget 1995; Miville- 
Deschenes et al. 1999). Non-gaussianity is also used as 
the indicator of coronal heating due to magneto-hydro- 
dynamical turbulence (Bocchialini et al. 1997; Georgoulis 
et al. 1998). Within the cosmological framework, the 
statistical nature of the Cosmic Microwave Background 
(CMB) temperature, or brightness anisotropics, probes 
the origin of the initial density perturbations which gave 
rise to cosmic structures (galaxies, galaxy clusters). The 
inflationary models (Guth 1981; Linde 1982) predict gaus- 
sian distributed density perturbations, whereas the topo- 
logical defect models (Vilenkin 1985; Bouchet 1988; Steb- 
bins 1988; Turok 1989) generate a non-gaussian distribu- 
tion. Because the nature of the initial density perturba- 
tions is a major question in cosmology, a lot of statistical 
tools have been developed to test for non-gaussianity. 

In order to test for non-gaussianity, one can use tradi- 
tional methods based on the distribution of temperature 
anisotropies. The simplest tests are based on the third 
and/or fourth order moments (skewness and kurtosis) of 
the distribution, both equal to zero for a gaussian dis- 
tribution. Another test for non-gaussianity through the 
temperature distribution is based on the study of the 
cumulants (Ferreira et al 1997; Winitzki 1998). The n- 
point correlation functions also give very valuable statis- 
tical information. In particular, the three-point function, 
and its spherical harmonic transform (the bispectrum), 
are appropriate tools for the detection of non-gaussianity 
(Luo & Schramm 1993; Kogut et al. 1996; Ferreira & 
Magueijo 1997; Ferreira et al. 1998; Heavens 1998; Spergel 
& Goldberg 1998). An investigation of the detailed be- 
haviour of each multipole of the CMB angular power spec- 
trum (transform of the two-point function) is another non- 
gaussianity indicator (Magueijo 1995). Finally, other tests 
of non-gaussianity are based on topological pattern statis- 
tics (Coles 1988; Gott et al. 1990). 



2 



Searching for non-gaussianity: Statistical tests 



The works of Ferreira & Magueijo (1997) on the search 
for non-gaussianity in Fourier space, and of Ferreira et 
al. (1997) on the cumulants of wavelet transformed maps, 
have shown that these approaches allow the study of char- 
acteristic scales, which is particularly interesting when 
studying a combination of gaussian and non-gaussian sig- 
nals as a function of scale. In the present work, we study 
non-gaussianity in a wavelet decomposition framework, 
that is, using the coefficients in the wavelet decomposi- 
tion. We decompose the image of the studied signal into 
a wavelet basis and analyse the statistical properties of 
the coefficient distribution. We then search for reliable 
statistical diagnoses to distinguish between gaussian and 
non-gaussian signals. The aim of this study is to find suit- 
able tools for demonstrating and to quantifying the non- 
gaussian signature of a signal when it is combined with 
a gaussian distributed signal of similar, or higher, ampli- 
tude. 

In section 2, we describe the methods for wavelet decom- 
position and filtering. We then briefly describe the char- 
acteristics of the wavelet we use in our study. We also give 
the main characteristics of the test data sets used in our 
work. In section 3, we present the statistical criteria we 
developed to test for non-gaussianity. We then apply the 
tests to sets of gaussian test maps (Sec. 4). We also ap- 
ply them, in section 5, to sets of non-gaussian maps as 
well as combinations of gaussian and non-gaussian signals 
with the same power spectrum. In section 6, we present 
and apply the detection strategy we propose to quantify 
the detectability of the non-gaussian signature. Finally, in 
section 7 we conclude and present our main results. 

2. Analysis 

2.1. The wavelet decomposition 

Wavelet transforms have been widely investigated recently 
because of their suitability for a large number of signal 
and image processing tasks. Wavelet analysis involves a 
convolution of the signal with a convolving function or a 
kernel (the wavelet) of specific mathematical properties. 
When satisfied, these properties define an orthogonal ba- 
sis, which conserves energy, and guarantee the existence 
of an inverse to the wavelet transform. 

The principle behind the wavelet transform, as de- 
scribed by Grossman & Morlet (1984), Daubechies (1988) 
and Mallat (1989), is to hierarchically decompose an input 
signal into a series of successively lower resolution refer- 
ence signals and their associated detail signals. At each 
decomposition level, L, the reference signal has a resolu- 
tion reduced by a factor of 2 L with respect to the original 
signal. Together with its respective detail signal, each scale 
contains the information needed to reconstruct the refer- 
ence signal at the next higher resolution level. Wavelet 
analysis can therefore be considered as a series of band- 
pass filters and be viewed as the decomposition of the sig- 
nal into a set of independent, spatially oriented frequency 



channels. Using the orthogonality properties, a function 
in this decomposition can be completely characterised by 
the wavelet basis and the wavelet coefficients of the de- 
composition. 

The multilevel wavelet transform (analysis stage) de- 
composes the signal into sets of different frequency locali- 
sations. It is performed by iterative application of a pair of 
Quadrature Mirror Filters (QMF). A scaling function and 
a wavelet function are associated with this analysis filter 
bank. The continuous scaling function 4>a(x) satisfies the 
following two-scale equation: 

<p A (x) = V2j2 h o(n)M 2x ~ n), (1) 

n 

where ho is the low-pass QMF. The continuous wavelet 
ij)A (x) is defined in terms of the scaling function and the 
high-pass QMF hi through: 

^ A {x) = ^/2 y ^hi{n)4> A (2x-n). (2) 

n 

The same relations apply for the inverse transform 
(synthesis stage) but, generally, different scaling function 
and wavelet (c/)s{x) and ips(x)) are associated with this 
stage: 

4>s{x)=V2^g Q {n)(j) S {2x-n), (3) 

n 

gi(n)4>s(2x - n). (4) 

n 

Equations (1) and (3) converge to compactly sup- 
ported basis functions when 

5>o(«)=Efo(«)=^. (5) 

n n 

The system is said to be biorthogonal if the following 
conditions are satisfied: 

/ (j)A{x)4>s(x — k)dx — 8{k) (6) 
Jr 

J <f> A {x)ip s {x - k)dx = (7) 

/ (p s (x)ipA(x- k)dx = (8) 
J& 

Cohen et al. (1990) and Vetterli & Herley (1992) give a 
complete treatment of the relationship between the filter 
coefficients and the scaling functions. 

When applied to bi-dimensional data (typically im- 
ages), three main types of decomposition can be consid- 
ered: dyadic (or octave band), pyramidal and uniform. 

1. A "dyadic" decomposition refers to a transform in 
which only the reference sub-band (low-pass part of 
the signal) is decomposed at each level. In this case, 
the analysis stage is applied in both directions of the 
image at each decomposition level. The total number 
of sub-bands after L levels of decomposition is then 
3L+1 (Fig. ||, upper panel). 
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Fig. 1. In the upper panel, a 4 level "dyadic" decomposition is 
used, in the lower, a 4 level "pyramidal" decomposition is used. 
In both cases, the diagonal squares and the last decomposition 
level are identical and the grey levels indicate the amplitude of 
the wavelet coefficients. 



2. A "pyramidal" decomposition is similar to a "dyadic" 
decomposition in the sense that only the reference sub- 
band is decomposed at each level, but it refers here 
to a transform that is performed separately in the two 
directions of the image. The total number of sub-bands 
after L levels of decomposition is then (L + l) 2 (Fig. 
|l], lower panel). 

3. A "uniform" decomposition refers to one in which all 
sub-bands are transformed at each level. The total 
number of sub-bands after L levels of decomposition 
is then 4 L . 



The wavelet functions are localised in space and, si- 
multaneously, they are also localised in frequency. There- 
fore, this approach is an elegant and powerful tool for im- 
age analysis because the features of interest in an image 
are present at different characteristic scales. Moreover, if 
the input field is gaussian distributed, the output is dis- 
tributed the same way, regardless of the power spectrum. 
This arises from the linear transformation properties of 
gaussian variables. The distribution of the wavelet coeffi- 
cients of a gaussian process is thus a gaussian. Conversely, 
we expect that any non-gaussian signal will exhibit a non- 
gaussian distribution of its wavelet coefficients. 

In our study, we have used bi-orthogonal wavelets, 
which are mainly used in data compression, because of 
their better performance than orthogonal wavelets, in 
compacting the energy into fewer significant coefficients. 
There exist bi-orthogonal wavelet bases of compact sup- 
port that are symmetric or antisymmetric. Antisymmet- 
ric wavelets are proportional to, or almost proportional 
to, a first derivative operator (e.g. the 2/6 tap filter (filter 
#5) of Villasenor et al. (1995), or the famous Haar trans- 
form which is an orthogonal wavelet). Symmetric wavelets 
are proportional to, or almost proportional to, a second 
derivative operator (e.g. the 9/3 tap filter of Antonini et 
al. (1992)). In the frame of detecting non-gaussian signa- 
tures, the choice of the wavelet basis is critical because non 
gaussian features exhibit point sources or step edges the 
wavelet must have a very good impulse response and a low 
shift variance, i.e. they better preserve the amplitude and 
the location of the details. Villasenor et al. (1995) have 
tested a set of bi-orthogonal filter banks, within this con- 
text, to determine the best ones for image compression. 
They conclude that even length filters have significantly 
less shift variance than odd length filters, and that their 
performance in term of impulse response is superior. In 
these filters, the high pass QMF is antisymmetric which is 
also a desirable property in the sense that we will also be 
interested in the statistical properties of the multi-scale 
gradients. Consequently, in our study, we have chosen the 
6/10 tap filter (filter #3) of Villasenor et al. (1995) (Fig. 

which represents the best compromise between all the 
criteria and energy compaction. Using this filter, we have 
chosen to perform a four level dyadic decomposition of our 
data. This particular wavelet and decomposition method 
have already been used for source detection by Aghanim 
et al. (1998). With such a decomposition, we also benefit 
from correlations between the two directions at each level, 
which is not the case with the pyramidal decomposition 
that treats both directions as if they were independent. 
Another advantage of this transform is that, for each level 
of decomposition, or scale, we benefit from the maximum 
number of coefficients possible, which is crucial for the 
statistics. 
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Scaling Function 




Wavelet Function 




Fig. 2. Scaling function (top) and wavelet function (bottom) 
corresponding to the filter #3 of Villasenor et al. (1995). Note 
that the wavelet function is antisymmetric. 



2.2. The test maps 

We use a test case which consists of sets of 100 maps of a 
gaussian signal and 200 maps of a non-gaussian signal, all 
having the same bell-shaped power spectrum and 512x512 
pixels. One of the non-gaussian sets (100 maps) consists 
of a distribution of disks with uniform amplitude (top- 
hat profiles), generating step edges. The disks have dif- 
ferent sizes and amplitudes and are randomly distributed 
in the map. The signal is weakly skewed because the av- 
erage skewness, which is the third moment of the distri- 
bution, computed over the 100 non-gaussian realisations 
is /Z3 = —0.10 ± 0.05 (one sigma error for one realisa- 
tion). Whereas the excess of kurtosis, the fourth moment 
(/Z4) of the distribution, is k = ^4/^ ~ 3 = 1.09 ± 0.17. 
The second set of 100 non-gaussian maps consists of a 
distribution of gaussian profiles with different sizes and 
amplitudes. The skewness and excess of kurtosis of the 
100 statistical realisations of the non-gaussian distribu- 
tion of gaussian profiles are respectively —0.06 ± 0.04 and 
1.19 ± 0.13. The same quantities computed over the 100 
gaussian maps are respectively ^3 = 0.14 10~ 2 ± 2.15 10~ 2 
and 0.03 10~ 2 ±3.87 10~ 2 . These numbers should be equal 
to zero for a gaussian distribution. They are not because 
there is a statistical dispersion over a finite set of reali- 
sations. The purpose of this paper being to develop suit- 
able statistical tests for non-gaussianity, we will not study 
other effects such as noise or beam dilution. These effects 
will be considered in an application of the method to the 
CMB signal in a second paper (Aghanim & Forni 1999). 



3. Tests of non-gaussianity 

The most direct and obvious way of analysing the statis- 
tical properties of an image is to use the distribution of 
the pixel brightnesses, or temperatures, together with the 
skewness and kurtosis. If the two quantities are different 
from zero, they indicate that the signal is non-gaussian. 
However, a weak non-gaussian signal will hardly be de- 
tected through the moments of the temperature distri- 
bution. Another way of addressing the problem is to use 
the coefficients in the wavelet decomposition and to study 
their statistical properties which in turn characterise the 
signal. In fact, the wavelet coefficients are quite sensitive 
to variations (even weak ones) in the signal, temperature 
or brightness, and hence to the statistical properties of the 
underlying process. We have developed two tests which ex- 
hibit the non-gaussian characteristics of a signal using the 
wavelet coefficients. Since our test maps are not skewed in 
the following we focus only on the results obtained using 
the fourth moment. 

For the first discriminator, we study the statistical 
properties of the distribution of the multi-scale gradient 
coefficients. This method is appropriate when dealing with 
a non-gaussian process characterised by sharp edges and 
consequently by strong gradients in the signal. Indeed, 
in any region where the analysed function is smooth the 
wavelet coefficients are small. On the contrary, any abrupt 
change in the behaviour of the function increases the am- 
plitude of the coefficients around the singularity. The de- 
tection of non-gaussianity is thus based on the search of 
these gradients. In the dyadic wavelet decomposition one 
can discriminate between the coefficients associated with 
vertical and horizontal gradients and the other coefficients. 
In our case, the vertical and horizontal gradients are anal- 
ogous to the partial derivatives, d/dx and d/dy, of the sig- 
nal. Mallat & Zhong (1992) give a thorough treatment of 
the characterisation of signals from multi-scale edges. We 
compute the quadratic sum of the coefficients, the quan- 
tity Gl — {d/dx) L + (d/dy) 2 L , at each decomposition level 
L. This quantity represents the squared amplitude of the 
multi-scale gradient of the image. In the following, we will 
however refer to it as the multi-scale gradient coefficient. 

The second statistical discriminator is based on the 
study of the wavelet coefficients related to the horizontal, 
vertical and diagonal gradients. These coefficients are as- 
sociated with the partial derivatives d/dy and d/dx, as in 
the multi-scale gradient method, as well as with the cross 
derivative d 2 /dxdy. The coefficients are computed at each 
decomposition level and their excess of kurtosis with re- 
spect to a Gauss distribution exhibits the non-gaussian 
signature of the studied signal. In this context, the wavelet 
coefficients associated with the first derivatives are obvi- 
ously closely related to the multi-scale gradient. 
In the following, we first apply our two tests to purely 
gaussian and non-gaussian maps. We then test the de- 
tectability of a non-gaussian signal added to a gaussian 
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one with same power spectrum and with increasing mix- 
ing ratios. 

4. Characterisation of gaussian signals 

4--1- The multi-scale gradient and its distribution 

For the 100 gaussian maps, we find that the histogram 
of the multi-scale gradient coefficients can be fitted by 
the positive wing of the Laplace probability distribution 
function: 

= ^ exp y j , (9) 

where \i\ is the mean of the distribution (theoretically 
equal to zero) and a 2 is its second moment. 

We plot in figure || (left panels) the distribution of the 
multi-scale gradient coefficients in the four decomposition 
scales for the gaussian signal. For reasons of legibility, we 
have just plotted the fit obtained with the 100 maps. The 
error bars represent a confidence interval (for one reali- 
sation) and account for the statistical dispersion of the 
realisations. 

We can analyse the multi-scale gradient distribution 
through its nth-order moments (tin)- In particular, we 
compute the excess of kurtosis using the second and fourth 
moments of the distribution (jj,2 and fJ-4). For a gaussian 
distribution, the normalised excess is zero. For a Laplace 
distribution, the fourth moment is given by /X4 = 6^. The 
normalised excess of kurtosis k = ^//j^ — Q highlights the 
non-gaussian signature of a signal through the departure 
of the multi-scale gradient from a Laplace distribution. 

At each decomposition level, we compute the nor- 
malised excess of kurtosis of the multi-scale gradient co- 
efficients for the 100 gaussian maps, and we derive a rep- 
resentative value of the distribution that is the mean k 
which we quote in Table [l]. The results show that k is 
very close to zero. The a values correspond to the root 
mean square values with respect to the mean k. The a 
values define a confidence interval, or a probability distri- 
bution of the excess of kurtosis. For the gaussian signal 
the upper and lower boundaries of this interval are equal 
suggesting that the k values are gaussian distributed. The 
increasing a (Fig. ||, left panels) with the decomposition 
scale is due to the larger dispersion. This feature is also the 
consequence of the smaller number of wavelet coefficients 
at higher decomposition scales. 

4-2. Partial derivatives 

The wavelet coefficient distributions associated with the 
first and cross derivatives are gaussian for the gaussian 
maps. We thus compute the normalised excess of kurtosis 
with respect to a gaussian distribution (k = Ha/ n\ — 3). 
These values are displayed in figure [| (dashed line) for 



Scale 
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c± 
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0.04 


0.15 


II 


0.05 


0.32 


III 


-0.01 


0.51 


VI 


-0.006 


0.869 



Table 1. The mean excess of kurtosis at four decomposition 
scales computed over the 100 gaussian maps. The a± values 
define the confidence intervals for one realisation. 
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0.010 


0.020 




0.003 
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d/dx 


II 


0.012 


0.043 


d 2 /dxdy 


0.005 


0.042 
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III 


-0.001 


0.079 




0.006 


0.072 


d/dy 


VI 


0.006 


0.150 




0.018 


0.014 



Table 2. The mean excess of kurtosis at four decomposition 
scales. k\ is computed using the wavelet coefficients associated 
with d/dx and d/dy. k 2 is computed using the coefficients of 
d 2 /dxdy. The a± values define the confidence intervals for one 
realisation. 



the gaussian maps. In this figure and at each decomposi- 
tion scale, the first set of 100 values stands for the excess 
of kurtosis of the wavelet coefficients associated with the 
horizontal gradient (d/dy). The second set of 100 values 
represents the same quantity computed for the vertical 
gradient (d/dx) and the last one represents the excess of 
kurtosis for the wavelet coefficients associated with cross 
derivative d 2 /dxdy (diagonal gradients). We note that for 
the gaussian maps the excess is always centred around 
zero at all the decomposition scales. For the multi-scale 
gradient, the dispersion around the mean k increases with 
increasing decomposition scale. In table ^, we quote the 
mean together with the confidence intervals at each scale. 
The results also show that the values are close to zero 
confirming the gaussian nature of the signal. 

As a result, we conclude that a gaussian signal can be 
characterised by the distribution of the multi-scale gra- 
dient coefficients and of the coefficients associated with 
d/dx, d/dy and d 2 /dxdy. In the first case, the multi-scale 
gradient coefficient distribution is fitted by a Laplace dis- 
tribution and the excess of kurtosis is zero. In the second 
case, the excesses of kurtosis are gaussian distributed with 
k = for the first and the cross derivatives. 

We check these two characteristics on other gaussian 
processes with different power spectra. For a white noise 
spectrum, we find identical results as in the study case: 
zero excess of kurtosis for the multi-scale gradient and the 
coefficients of the partial derivatives. Since our statistical 
tests are based on the statistics of the wavelet coefficients 
at each decomposition scale, we expect that a sharp cut 
off in the power spectrum of the gaussian signal will in- 
duce a sample variance problem. We check this behaviour 
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Wavelet Gradient Histogram (Scale I) 



Wavelet Gradient Histogram (Scale I) 
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Wavelet Gradient Histogram (Scale II) 
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Wavelet Gradient Histogram (Scale III) 
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Fig. 3. Fits of the multi-scale gradient coefficient histograms obtained with the 100 statistical realisations. In all plots, the error 
bar is for one realisation. The left panels are given for the gaussian process whereas the right panels represent the non-gaussian 
signal. The wings at large multi-scale gradient indicate the non-gaussian signature. 



using a gaussian process exhibiting a very sharp cut off 
with a shape close to a Heaviside function, at the second 
decomposition scale. This cut off is similar to the cut off 
expected in the CMB power spectrum in a standard cold 
dark matter model. At all the decomposition scales except 
the second, we find the expected zero excess of kurtosis for 
the wavelet coefficients of gaussian signals. At the second 
decomposition scale, we find a non-zero excess of kurto- 



sis for the multi-scale gradient coefficients, as well as for 
the coefficients related to the first derivatives, which could 
be misinterpreted for a non-gaussian signature. This non- 
zero excess has nothing to do with an intrinsic property 
of the studied signal, as the latter is gaussian at all scales. 
It comes from the very sharp decrease in power combined 
with the narrow filter associated with the wavelet basis. 
In fact, the contribution of the gaussian process, at this 
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Fig. 4. Excess of kurtosis computed over the wavelet coefficients of respectively d/dx for the left panels, d/dy for the centre 
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scale, is sparse. Therefore, it induces a sample variance 
effect which in turn results in a non-zero excess of kur- 
tosis. We tested a wider filtering wavelet and found that 
the excess of kurtosis decreases. Nevertheless, a wider fil- 
tering wavelet smoothes the non-gaussian signatures and 
reduces the efficiency of our discriminating tests. However, 
at the second decomposition scale, the wavelet coefficients 
associated with & 1 jdxdy exhibit no excess of kurtosis for 
a process with sharp cut off. This indicates that the ex- 
cess of kurtosis computed using the cross derivative coeffi- 
cients is more reliable in characterising a gaussian process, 
and consequently non-gaussianity, regardless of the power 
spectrum. 

We also analysed the sum of gaussian signals. When 
there is no cut off in the power, the excess of kurtosis for 
the sum of gaussian signals is zero for both discriminators. 
By contrast, if one of the gaussian signals presents a cut 
off at any decomposition scale, we again find a non-zero 
excess of kurtosis at the corresponding scale. 

5. Application to non-gaussian signals 

We apply our statistical discriminators to detect the non- 
gaussian signature of different processes. We first study 
two sets of non-gaussian maps, one constituted of a dis- 
tribution of top-hat profiles and the other constituted of 
a distribution of gaussian profiles, both having the same 
power spectrum as the gaussian test maps used in the pre- 
vious section. We also apply the statistical test to a combi- 
nation of gaussian and non-gaussian signals with different 
mixing ratios. 

5.1. The multi-scale gradient and its distribution 

We compute the multi-scale gradient coefficients (Gl) us- 
ing 100 statistical realisations of the non-gaussian process 
(top- hat profiles). At the four decomposition scales, we 
plot the fitted histogram (right panels of Figure ||). We 
note that the distribution of the multi-scale gradient coef- 
ficients also fits a Laplace distribution for small values of 
Gl- However, there is a significant departure from this dis- 
tribution for higher values. This is exhibited by the larger 
error bars and by the wings of the gradient distribution at 
large G l ■ Figure || (right panels) exhibits the non-gaussian 
signatures mostly at the first three decomposition scales. 
At the fourth, the lack of coefficients enlarges the error 
bars but we still marginally distinguish the non-gaussian 
signal. 

In our test case, the process is lcptokurtic, that is the 
non-gaussianity is characterised by a positive excess of 
kurtosis. We quote, in the left panel of Table || the me- 
dian excesses of kurtosis computed with the multi-scale 
gradient coefficients of the 100 maps. This is a more suit- 
able quantity to characterise a non-gaussian process, than 
the mean k, as there is an important dispersion of the k 
values with a clear excess towards large values. The <r±, 



Scale 


fci 






k 2 




cr_ 


I 


14.50 


35.38 


5.06 


1.5 


0.51 


0.27 


II 


3.87 


2.68 


0.77 


1.9 


1.43 


0.48 


III 


3.54 


2.17 


1.24 


3.46 


2.60 


1.24 


VI 


3.57 


7.13 


1.68 


2.88 


2.55 


1.82 



Table 3. fci is the median excess of kurtosis, at four decompo- 
sition scales, computed with the 100 non-gaussian maps (top- 
hat profile), ki is computed with the the 100 non-gaussian maps 
(gaussian profile). a+ and o- give the confidence interval for 
one realisation. 



which represents the rms excess of kurtosis with respect 
to the median for one realisation, takes naturally into ac- 
count the non symmetric distribution of the multi-scale 
gradients. This results in a lower boundary (<r_) for the 
confidence interval smaller that the upper boundary (cr + ) . 
The latter is biased towards large values as we are studying 
a leptokurtic process. Therefore, the comparison between 
the values of k and c_ indicates the detectability of non- 
gaussianity. When k — <j- for one realisation differs from 
zero by a value of the order of, or larger than, er_ this 
suggests that the signal is non-gaussian. For the top-hat 
profiles, there is an obvious excess of kurtosis at all scales. 

In order to test the non-gaussian signature arising form 
different processes with same power spectra, we analyse 
a set of 100 non-gaussian maps made of the superposi- 
tion of gaussian profiles of different sizes and amplitudes. 
We compute the median value of the excess of kurtosis 
and the corresponding confidence intervals (Tab. ^, right 
panel). We note that k is different from zero at all scales, 
exhibiting the non-gaussian nature of the studied process. 
However, it is smaller than in the case of the top-hat pro- 
files. This decrease is due to the superposition of smoother 
profiles. 

We now add one representative gaussian map to 100 
non-gaussian maps (top-hat profiles) . As the non-gaussian 
signal is very strongly dependent on the studied map, 
it is necessary to span a large set of non-gaussian sta- 
tistical realisations in order to have a reliable statistical 
specification of non-gaussianity. The gaussian and non- 
gaussian signals were summed with different mixing ra- 
tios represented by the ratio of their rms amplitudes 

rms — & gauss / Gnon — gauss) - 

After wavelet decomposition, we compute the multi-scale 
gradient coefficients of the summed maps and derive the 
normalised median excess of kurtosis with respect to a 
Laplace distribution together with the confidence inter- 
vals. The results are quoted in Table ^ as a function of the 
mixing ratio R rm s and the wavelet decomposition scale. 
For R rms = 1, the excess of kurtosis is larger than that 
of the gaussian test map and it is smaller than that of 
the purely non-gaussian signal. The summation of the 
two processes has therefore, as expected, smoothed the 
gradients and diluted the non-gaussian signal. For a non- 
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Ratio 


Scale 


k 


o + 


<7_ 


Rrms — 


l 


I 


6.64 


12.18 


2.36 






II 


1.13 


0.80 


0.41 






III 


0.78 


1.42 


0.57 






VI 


0.62 


2.81 


0.82 


Rrms — 


2 


I 


1.64 


3.42 


0.56 






II 


0.24 


0.39 


0.23 






III 


0.06 


0.77 


0.34 






VI 


0.23 


1.08 


0.58 


Rrms — 


3 


I 


0.36 


0.44 


0.11 






II 


0.09 


0.16 


0.09 






III 


0.04 


0.26 


0.21 






VI 


0.26 


0.37 


0.36 


Rrms — 


4 


I 


0.33 


0.15 


0.06 






II 


0.09 


0.08 


0.06 






III 


0.05 


0.13 


0.13 






VI 


0.27 


0.19 


0.23 



Table 4. The median excess of kurtosis k, at four decompo- 
sition scales, computed over the 100 non-gaussian maps added 
to the gaussian map as a function of the mixing ratios R rm s 
and the decomposition scales. The o+ and <t_ values represent 
respectively the upper and lower boundaries of the confidence 
interval for one realisation. They are the rms values with re- 
spect to the median excess of kurtosis. 





Scale 


fci 


o-+ 


(j— 




k 2 


cr± 




I 


0.92 


0.28 


0.09 




0.65 


0.05 


d/dx 


II 


0.51 


0.08 


0.07 


d 2 jdxdy 


0.35 


0.06 


& 


III 


0.41 


0.15 


0.11 




0.24 


0.11 


d/dy 


VI 


0.43 


0.34 


0.21 




0.28 


0.21 



Table 5. The median excess of kurtosis, at four decomposition 
scales. k\ is computed using the wavelet coefficients associated 
with d/dx and d/dy. k 2 is given for d 2 /dxdy. The 100 non- 
gaussian maps (top-hat profile) have the same power spectrum 
as the gaussian test maps. The cr+ and u_ values are the rms 
values for one realisation with respect to the median excess of 
kurtosis. 



gaussian signal half that of the gaussian signal, only the 
first three scales indicate an excess of kurtosis different 
from the gaussian one. For the ratio i? rms = 3, only the 
first scale has an excess marginally different from the gaus- 
sian signal. For larger ratios, the non-gaussian signal is 
quite blurred. 

5.2. Partial derivatives 

For the 100 non-gaussian maps (top-hat profile) with the 
same power spectrum as the gaussian test maps, we com- 
pute the normalised excess of kurtosis, with respect to a 
gaussian, of the wavelet coefficients associated with d/dx, 
d/dy and d 2 /dxdy. As for the multi-scale gradient, we 
derive the median excess of kurtosis and the upper and 
lower boundaries of the confidence intervals. The results, 



Scale 




ki 


o-+ 






k 2 


er± 


I 




0.19 


0.03 


0.02 




0.10 


0.03 


II 


d/dx 


0.21 


0.07 


0.06 




0.13 


0.05 


III 


& 


0.35 


0.14 


0.13 


d 2 /dxdy 


0.17 


0.10 


VI 


d/dy 


0.24 


0.24 


0.18 




0.21 


0.24 



Table 6. The median excess of kurtosis, at four decomposition 
scales, computed over the 100 non-gaussian maps (gaussian 
profile), ki is the median excess computed with the coefficients 
associated with the vertical and horizontal gradients, and k 2 is 
given for the cross derivative. The a values are the boundaries 
of the confidence interval for one statistical realisation. 



given m Table § show non-zero excesses of kurtosis for 
the first and cross derivatives at all decomposition scales. 

In Figure ||, the solid line represents the values of the 
excess of kurtosis of each non-gaussian realisation. The 
dashed line represents the same quantity for the gaussian 
test maps. We first note the overall shift of the values to- 
wards non-zero positive values (leptokurtic signal), with 
some very large values compared to the median. A sec- 
ond characteristic worth noting is the difference in ampli- 
tudes between, on the one hand, the excess of kurtosis of 
the coefficients associated with d 2 /dxdy and, on the other 
hand, those associated with d/dx and d/dy. The former 
are indeed smaller. As the excess of kurtosis of the first 
derivative coefficients is of the same order, we compute 
one median k over d/dx and d/dy coefficients, and com- 
pare it to the excess of kurtosis of the cross derivative. At 
the first two decomposition scales there is an important 
and noticeable difference between the two sets of values 
d/dx and d/dy, and d 2 /dxdy. At the third and fourth 
decomposition scales, the difference decreases but is still 
present. 

For the non-gaussian process made of the superposi- 
tion of gaussian profiles, we compute the median excess 
of kurtosis associated with the wavelet coefficients of the 
first and cross derivatives (Tab. ||). As for the multi-scale 
gradient coefficients, we find that the excess of kurtosis is 
smaller for this type of non-gaussian maps but it is still 
significantly different from zero at all scales except the 
fourth. 

We analyse the sum of a representative gaussian map 
and the set of 100 non-gaussian maps (top-hat profile). 
The sum of the two processes has been performed again 
with different mixing ratios. The results we obtain are 
given m Table g. An accompanying figure (Fig. |) illus- 
trates the corresponding results for a mixing ratio Rrms = 
1 (solid line). In this figure, the dashed line represents the 
gaussian process alone. For R rms = 1, we find that non- 
gaussianity is detected at all decomposition scales for both 
first and cross derivative coefficients. For a mixing ratio 
Rrms = 2, we observe a significant excess of kurtosis only 
at the first decomposition scale. For R rm s > 3, the excess 
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Fig. 5. Excess of kurtosis computed with the wavelet coefficients ofd/dx and d/dy for respectively the left and centre panels. 
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Rrms 


Scale 




fci 


c+ 






k 2 


o± 


l 


I 




0.44 


0.14 


0.06 




0.25 


0.03 




II 


d/dx 


0.15 


0.06 


0.05 




0.09 


0.03 




III 


& 


0.09 


0.11 


0.08 


d 2 /dxdy 


0.03 


0.08 




VI 


d/dy 


0.10 


0.22 


0.16 




-0.06 


0.08 


2 


I 




0.13 


0.06 


0.03 




0.07 


0.02 




II 


d/dx 


0.04 


0.04 


0.04 




-0.002 


0.030 




III 


& 


-0.002 


0.090 


0.068 


d 2 /dxdy 


-0.07 


0.06 




VI 


d/dy 


0.03 


0.13 


0.13 




-0.20 


0.08 


3 


I 




0.05 


0.04 


0.03 




0.03 


0.02 




II 


d/dx 


0.02 


0.04 


0.04 




-0.02 


0.02 




III 


& 


-0.007 


0.078 


0.071 


d 2 /dxdy 


-0.10 


0.04 




VI 


d/dy 


0.01 


0.11 


0.11 




-0.26 


0.06 


4 


I 




0.03 


0.04 


0.03 




0.03 


0.01 




II 


d/dx 


0.02 


0.04 


0.04 




-0.02 


0.02 




III 


& 


-0.006 


0.072 


0.074 


d 2 /dxdy 


-0.11 


0.03 




VI 


d/dy 


0.003 


0.097 


0.078 




-0.27 


0.04 



Table 7. The median excess of kurtosis, at four decomposition scales, for 100 non-gaussian maps (top-hat profile) added to one 
gaussian map (with same power spectrum) as a function of the mixing ratio R rm s- fci is computed with the coefficients associated 
with d/dx and d/dy, and k 2 is given for d 2 / dxdy . The a values are the boundaries of the confidence interval for one statistical 
realisation. 



becomes marginal for both d/dx and d/dy coefficients at 
the first decomposition scale, all other scales showing no 
departure from gaussianity. The same tendency is noted 
for the coefficients associated with d 2 /dxdy. 

6. Detection strategy of non-gaussianity 

We have characterised the gaussian signal through the ex- 
cess of kurtosis of the multi-scale gradient and partial (first 
and cross) derivative coefficients using processes with dif- 
ferent power spectra. When the power spectrum of the 
process exhibits a sharp cut off in one of the wavelet de- 
composition windows we find that the excess of kurtosis, 
associated with the coefficients of the first derivatives, and 
consequently of the multi-scale gradient, is non-zero at the 
filtering level of the cut off. On the contrary, the excess of 
kurtosis computed with the cross derivative coefficients is 
zero. 

Accordingly, we propose a detection strategy to test for 
non-gaussianity. We compare a set of maps of the "real" 
observed sky to a set of gaussian realisations having the 
power spectrum of the "real sky" . Our proposed method 
overcomes the problems arising from eventual cut off in 
the power spectrum of the studied process, and the con- 
sequent possible misinterpretations on the statistical sig- 
nature. It also constitutes the most general approach to 
exhibit the statistical nature (gaussian or not) of a signal 
and quantify its detectability through our statistical tests. 
Our detection strategy of the non-gaussianity is based on 
the following steps: 

• using observed maps of the "real sky" we compute 
the angular power spectrum of the signal, regardless of its 
statistical nature. 



• We simulate gaussian synthetic realisations of a pro- 
cess having the power spectrum of the "real" process. On 
the obtained gaussian test maps filtered with the wavelet 
function, we compute the excess of kurtosis for the multi- 
scale gradient and derivative coefficients. This analysis al- 
lows us to characterise completely the gaussian maps, nat- 
urally taking into account eventual sample variance effects 
due to cut off at any scale. 

• For the set of observed maps of the "real sky" , we 
compute the excesses of kurtosis associated with the multi- 
scale gradient, and the derivative coefficients. 

• Assuming that the realisations (maps) are indepen- 
dent, each value of the excess of kurtosis has a probabil- 
ity of 1/N, where N is the number of maps. Using the 
computed excesses of kurtosis of both the gaussian and 
non-gaussian realisations, we deduce the probability dis- 
tribution function (PDF) of the excess of kurtosis, for the 
multi-scale gradient coefficients and for the coefficients re- 
lated to the derivatives. 

• The last step consists of quantifying the detectabil- 
ity of the non-gaussian signature. That is to compare the 
PDF of the gaussian process to the PDF of the "real sky" . 
In practice this can be done by computing at each decom- 
position scale the probability that the median excess of 
kurtosis of the non-gaussian maps belongs to the PDF of 
the synthetic gaussian counterparts. It is the probability 
that a random variable is greater or equal to the real me- 
dian k. We take k which is the asymptotic value given by 
the central limit theorem. Another way of comparing the 
two PDFs is to use the Kolmogorov-Smirnov (K-S) test 
(Press et al. 1992) which gives the probability for two dis- 
tributions to be identical. This test for non-gaussianity is 
more global than the previous test because it is sensitive 
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to the shift in the PDFs, especially the median value, and 
to the spread of the distributions. This property makes 
it more sensitive to non-gaussianity especially in the case 
where we only have a small number of observed maps. 

We apply our detection strategy of non-gaussianity to 
the non-gaussian test maps constituted of the top-hat and 
gaussian profiles. For illustrative purposes, we give the re- 
sults of the multi-scale gradient coefficient only. At the 
first three decomposition scales and for both sets of maps, 
we find that the probability for the signal to be non- 
gaussian is 100% using the probability of the measured 
k to belong to the gaussian PDF. At the fourth scale, the 
probability is 99.99% and 99.95% for respectively the top- 
hat and gaussian profile distribution. The K-S test gives 
a 100% probability of detecting non-gaussianity. The de- 
tcctability of the non-gaussian signature, for the sum of 
the gaussian and non-gaussian (top-hat profile) maps with 
mixing ratio i? rms = 1, is 100% at the first decomposition 
scale. It is 99.96% and 93.4% at the second and third scale, 
and 76.43% at the fourth scale. For R rms — 2, the first 
scale is still perfectly non-gaussian, and only the second 
scale is detected with a probability of 72.4%. The K-S test 
gives more or less the same results for both mixing ratios. 
The results are illustrated in Figure p| (for the multi-scale 
gradient coefficients) and in Figure R(for the cross deriva- 
tive coefficients). In these plots, the solid line represents 
the PDF of the excess of kurtosis for the non-gaussian 
measured signal. The dashed line represents the PDF of 
the synthetic gaussian maps with same power spectrum. 
In both figures the left panels are for the non-gaussian sig- 
nal alone, whereas the right panels are for the sum with a 
mixing ratio of one. 

7. Discussion and Conclusions 

In the present work we develop two statistical discrim- 
inators to test for non-gaussianity. To do so, we study 
the statistical properties of the coefficients in a four level 
dyadic wavelet decomposition. 

Our first discriminator uses the amplitude of the multi- 
scale gradient coefficients. It is based on the computation 
of their excesses of kurtosis with respect to a Laplace dis- 
tribution function. The second test relies on the compu- 
tation of the excesses of kurtosis for the first and cross 
derivative coefficients. It can itself be divided into one spe- 
cific test using the first derivatives and the other the cross 
derivative. For both discriminators (multi-scale gradient 
and partial derivatives), the gaussian signature is charac- 
terised by a zero excess of kurtosis. We check this property 
for several gaussian processes with different power spec- 
tra and for a signal made of the sum of gaussian signals. 
Given this property for gaussianity, the departure form 
a zero value of the excess of kurtosis indicates the non- 
gaussian signature. 



Excess of kurtosis PDF Excess of kurtosis PDF 




-10 10 20 30 40 -5 5 10 15 

Scale (IV) Scale (IV) 



Fig. 6. Probability distribution functions of the excess of kur- 
tosis, as percentages, computed with the multi-scale gradient 
coefficients. In the two panels, the dashed line represents PDF 
of the gaussian test maps. In the left panels, the solid line rep- 
resents the non-gaussian signal alone (top-hat profiles). In the 
right panels, the solid line represents the sum of the gaussian 
and non-gaussian processes with R r ms = 1. 

In order to overcome peculiar features in the power 
spectrum (e.g. sharp cut offs) at any wavelet decompo- 
sition scale, which could be misinterpreted for a non- 
gaussian signature, we propose the following detection 
strategy. We simulate synthetic gaussian maps with the 
same power spectrum as the non-gaussian studied signal. 
We compute the excess of kurtosis for the two discrimi- 
nators, and for both gaussian and non-gaussian maps. We 
derive the PDF in each case. Then, we quantify the de- 
tectability of non-gaussianity by estimating the probabil- 
ity that the median excess of kurtosis of the non-gaussian 
signal belongs to the PDF of the gaussian counterpart, and 
by applying the K-S test to discriminate the gaussian and 
the "real" PDFs. We apply our detection strategy to the 
test maps of non-gaussian signals alone, and to the sum 
of gaussian and non-gaussian signals. In the first case, we 
show that the non-gaussian signature emerges clearly at 
all scales. In the second case, the detection depends on the 
mixing ratio (a gauss /cr non _ ffa „ ss ). Down to a mixing ratio 
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Fig. 7. Probability distribution functions of the excess of kur- 
tosis, as percentages, computed with the coefficients associated 
with the cross derivative. In the two panels, the dashed line 
represents PDF of the gaussian test maps. In the left panels, 
the solid line represents the non-gaussian signal alone (top-hat 
profiles). In the right panels, the solid line represents the sum 
of the gaussian and non-gaussian processes with R rm .s = 1. 



of about 3, which is about 10 in term of power, we detect 
the non-gaussian signature. 

In parallel to our work, Hobson et al. (1998) have used 
the wavelet coefficients to distinguish the non-gaussianity 
due to the Kaiser-Stebbins effect (Bouchet et al 1988; 
Stebbins 1988) of cosmic strings. They used the cumulants 
of the wavelet coefficients up to the fourth order (Ferreira 
et al 1997), in a pyramidal decomposition. As mentioned 
in section 



2.1 and in Hobson et al. (1998), the pyrami- 
dal decomposition induces a scale mixing. Therefore, it 
does not take advantage of the possible spatial correla,- 



tions olf the signal Furthermore, it gives smaller numbers 
of coefficients within each sub-band for the analysis. We 
instead use the dyadic decomposition to avoid these two 
weaknesses, as in Aghanim et al. (1998). 

In our study, we use weakly non-gaussian simulated 
maps (small kurtosis). Such a weak non-gaussian signa- 
ture, by contrast with the Kaiser-Stebbins effect, and with 
the point-like or peaked profiles, is detected using our 



statistical discriminators. Using the bi-orthogonal wavelet 
transform, we succeed in emphasising the non-gaussianity, 
by the means of the statistics of the wavelet coefficient dis- 
tributions. This detection is also possible using other bi- 
orthogonal wavelet bases, but their efficiency is lower at 
larger scales. Consequently, the choice of the wavelet ba- 
sis depends also on the characteristics of the non-gaussian 
signal one wants to emphasise. However we believe that 
the wavelet basis we choose represents an optimal com- 
promise for a large variety of non-gaussian features. 
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